A nomogram predicting the severity of COVID-19 based on initial clinical and radiologic characteristics

Aim: This study aimed to build an easy-to-use nomogram to predict the severity of COVID-19. Patients & methods: From December 2019 to January 2020, patients with confirmed COVID-19 in our hospital were enrolled. The initial clinical and radiological characteristics were extracted. Univariate and multivariate logistic regression were used to identify variables for the nomogram. Results: In total, 104 patients were included. Based on statistical analysis, age, neutrophil count, creatinine level, procalcitonin level and number of involved lung segments were selected for the nomogram. The area under the curve was 0.939 (95% CI: 0.893–0.984). The calibration curve showed good agreement between the predictions of the nomogram and the observations in the primary cohort. Conclusion: An easy-to-use nomogram with good discrimination was built to predict the severity of COVID-19.


Study design & participants
From 31 December 2019 to 22 January 2020, all consecutive patients admitted to the Zhongnan Hospital of Wuhan University with confirmed COVID-19 were enrolled. Patients admitted to the intensive care unit who required mechanical ventilation or had a fraction of inspired oxygen of at least 60% were classified as severe/critical type [9]. The other patients were classified as common type. All patients were followed up until hospital discharge, death or the last follow-up date of 29 February 2020.
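The classification rule described above can be written as a small predicate. The following Python sketch is illustrative only: the field names (`in_icu`, `mechanical_ventilation`, `fio2_percent`) are hypothetical and are not taken from the study's data collection forms, and the sketch assumes that intensive care unit admission is required for both criteria.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names; not taken from the study's data forms.
    in_icu: bool
    mechanical_ventilation: bool
    fio2_percent: float  # fraction of inspired oxygen, expressed in percent


def clinical_type(p: Patient) -> str:
    """Return 'severe/critical' or 'common' under the rule described above.

    Assumption: ICU admission is required for both criteria
    (mechanical ventilation, or FiO2 of at least 60%).
    """
    needs_support = p.mechanical_ventilation or p.fio2_percent >= 60.0
    return "severe/critical" if (p.in_icu and needs_support) else "common"
```

Under these assumptions, a ventilated ICU patient is labeled severe/critical, whereas a ward patient receiving 40% oxygen is labeled common type.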

Data collection
The clinical characteristics including demographic data, medical history, epidemiological characteristics, underlying co-morbidities, clinical symptoms and signs, and laboratory findings were extracted from electronic medical records. Two investigators independently reviewed the data collection forms to verify data accuracy. The laboratory tests consisted of complete blood count, coagulation function, liver and renal function, C-reactive protein (CRP), erythrocyte sedimentation rate, procalcitonin, lactate dehydrogenase and creatine kinase.
All CT images were independently reviewed by two experienced chest radiologists (H Zhang and F Zhong) on an image archiving and communication system. Decisions were reached by consensus. Lesion density was classified as ground-glass opacity (GGO), defined as increased lung parenchymal attenuation that did not obscure the underlying vascular architecture, or as consolidation, defined as opacification in which the underlying vasculature was obscured [17]. Lesion location was classified as peripheral if it was in the outer third of the lung; otherwise, it was defined as dispersed. Lesion size was defined as the size of the largest lesion and was classified as small (diameter: <1 cm), medium (diameter: 1-3 cm), large (diameter: 3 cm to 50% of the segment) or segmental (50-100% of the segment) [18]. The number of involved lung segments was recorded. Other CT features, such as intralobular and interlobular septal thickening, air bronchogram, pleural effusion and a short-axis diameter of mediastinal lymph nodes larger than 1 cm, were also recorded.
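The lesion size categories above amount to a simple threshold rule. The sketch below is one possible encoding in Python; the function name and the `segment_fraction` parameter are hypothetical, and the handling of values that fall exactly on a boundary (1 cm, 3 cm, 50%) is an assumption, because the text does not state which class owns the boundary.

```python
def lesion_size_category(diameter_cm: float, segment_fraction: float) -> str:
    """Classify the largest lesion as small, medium, large or segmental.

    diameter_cm: largest lesion diameter in centimeters.
    segment_fraction: share of the lung segment occupied, from 0.0 to 1.0.
    Boundary ownership is an assumption (lower bound inclusive).
    """
    if diameter_cm < 0 or not 0.0 <= segment_fraction <= 1.0:
        raise ValueError("diameter must be >= 0 and fraction within [0, 1]")
    if segment_fraction >= 0.5:
        return "segmental"  # 50-100% of the segment
    if diameter_cm < 1.0:
        return "small"      # < 1 cm
    if diameter_cm < 3.0:
        return "medium"     # 1-3 cm
    return "large"          # 3 cm up to 50% of the segment
```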

Statistical analysis
Categorical variables were described by frequencies and percentages and compared between severe/critical and common type COVID-19 using the χ² test or Fisher's exact test. A two-sided p-value of less than 0.05 was considered statistically significant, and variables meeting this threshold were included in the multivariate logistic regression analysis. Likelihood ratio multivariate logistic regression analysis was used to identify predictors of the severity of COVID-19. Unless otherwise specified, variables with p < 0.2 were retained in a relatively parsimonious final model. These statistical analyses were performed using SPSS software, version 21.0 for Windows (SPSS Inc).
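For readers who want to see what the two categorical tests compute, the following Python sketch implements both for a 2×2 table using only the standard library. It is not the SPSS procedure used in the study; the function names are hypothetical, the χ² version applies no continuity correction, and the counts in the usage note are invented for illustration.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column must contain at least one case")
    stat = n * (a * d - b * c) ** 2 / margins
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        return math.comb(col1, x) * math.comb(n - col1, row1 - x) / math.comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

As an invented example, `fisher_exact_2x2(3, 1, 1, 3)` evaluates the table with rows (3, 1) and (1, 3), and `chi_square_2x2(10, 20, 20, 10)` returns the statistic together with its p-value.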
A nomogram was built based on the results of the logistic regression analysis using R, version 3.6.3, with the rms package. The area under the curve (AUC) was used to evaluate the performance of the model in predicting the risk of severity transformation. The closer the AUC is to 1, the better the model discriminates between severe/critical and common type cases. Moreover, for internal validation of the nomogram, a calibration curve was plotted based on 1000 bootstrap resamples to assess the predictive accuracy of the nomogram [19]. The closer the calibration curve is to the 45-degree diagonal line, the better the performance of the prediction model.
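The AUC has a direct interpretation: it is the probability that a randomly chosen severe/critical patient receives a higher predicted risk than a randomly chosen common type patient, with ties counted as one half. The Python sketch below computes it that way and adds a simple bootstrap of the AUC. It is a simplified illustration, not the rms calibration procedure used in the study (which refits the model in each resample); the function names and the toy data in the usage note are hypothetical.

```python
import random
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for severe/critical, 0 for common type.
    scores: predicted risks; ties between a positive and a negative count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels: Sequence[int], scores: Sequence[float],
                  n_boot: int = 1000, seed: int = 0) -> List[float]:
    """Recompute the AUC on n_boot resamples drawn with replacement."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one case in each class")
    rng = random.Random(seed)
    n = len(labels)
    results: List[float] = []
    while len(results) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one class entirely
            results.append(auc(ys, [scores[i] for i in idx]))
    return results
```

With the toy data `labels = [0, 0, 1, 1]` and `scores = [0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ranked correctly, so `auc` returns 0.75. Sorting the output of `bootstrap_auc` and reading off the 2.5th and 97.5th percentiles gives one common form of percentile interval.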

Results
In our hospital, 133 consecutive patients were confirmed to have COVID-19 by real-time RT-PCR of pharyngeal swab samples. Seventeen patients without a chest CT examination within 3 days of admission and 12 nonhospitalized patients were excluded. Eventually, 104 patients with chest CT scans were included in our study. All patients lived in Wuhan, China. Chest CT scans were performed on four different CT machines: 74 patients on a Revolution HD (GE Healthcare, WI, USA), seven patients on an Ingenuity (Philips Healthcare, MA, USA), 15 patients on a Somatom Definition (Siemens, Erlangen, Germany) and eight patients on a uCT 750 (United Imaging, Shanghai, China). The slice thickness of the CT images was 1 mm, and CT images were reviewed on a lung window (window level -440, window width 1500) and a mediastinal window (window level 40, window width 400).
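The window settings above determine which attenuation values are spread across the displayed gray scale: a window spans from level minus half the width to level plus half the width. The Python sketch below shows this standard linear mapping, assuming the level and width are given in Hounsfield units; the function names are hypothetical.

```python
from typing import Tuple


def window_bounds(level: float, width: float) -> Tuple[float, float]:
    """Return the (lower, upper) attenuation limits of a display window."""
    half = width / 2.0
    return level - half, level + half


def apply_window(value: float, level: float, width: float) -> float:
    """Map an attenuation value to a display intensity in [0, 1].

    Values below the window are shown as black (0.0) and values above
    it as white (1.0); values inside are scaled linearly.
    """
    lower, upper = window_bounds(level, width)
    clipped = min(max(value, lower), upper)
    return (clipped - lower) / (upper - lower)
```

For the lung window (level -440, width 1500), `window_bounds` gives -1190 to 310; for the mediastinal window (level 40, width 400), it gives -160 to 240.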

Clinical characteristics
The clinical characteristics of the included cases are shown in Table 1. [...] with severe/critical type passed away; the others were cured and discharged from the hospital. Thirty-nine (31.0%) patients had co-morbidities, including eight (7.7%) chronic pulmonary diseases, ten (9.6%) heart diseases, ten (9.6%) diabetes, 23 [...]. Figure 1 shows the typical chest CT findings of a COVID-19 patient. One patient with chronic lymphocytic leukemia and one patient with lung cancer had enlarged mediastinal lymph nodes and a small pleural effusion. Two patients with chronic obstructive pulmonary disease had enlarged mediastinal lymph nodes.

Nomogram building & validation
Based on the results of regression analysis and clinical consideration, a nomogram was built that incorporated five variables (age, neutrophil count, creatinine, number of involved lung segments and procalcitonin) to predict the severity of COVID-19 (Figure 2). A total score was calculated by adding every single score from the independent five variables. By projecting the total score to the lowest scale, we were able to estimate the severity of COVID-19. The AUC was 0.939 (95% CI: 0.893-0.984) in the training set before the bootstrap technique was applied, showing good discrimination of the severity of COVID-19. The calibration curve showed a good agreement between the prediction of the nomogram and the observations in the primary cohort (Figure 3).
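The way a nomogram turns a logistic regression model into a point scale can be sketched in a few lines of Python. In the sketch below, every coefficient, the intercept and every axis range are HYPOTHETICAL values chosen for illustration; they are not the fitted values from this study, and the names (`MODEL`, `variable_points`, `predicted_risk`) are likewise invented. The sketch assumes positive coefficients, so that the axis minimum of each variable scores 0 points, and scales the variable with the widest effect to 100 points.

```python
import math
from typing import Dict, Tuple

# HYPOTHETICAL model: name -> (coefficient per unit, axis minimum, axis maximum).
# These are NOT the study's fitted values.
MODEL: Dict[str, Tuple[float, float, float]] = {
    "age": (0.05, 20.0, 95.0),
    "neutrophil_count": (0.20, 0.0, 20.0),
    "creatinine": (0.01, 30.0, 200.0),
    "procalcitonin": (0.50, 0.0, 5.0),
    "involved_segments": (0.15, 0.0, 18.0),
}
INTERCEPT = -8.0  # hypothetical


def points_per_unit() -> float:
    """Points per unit of linear predictor; the widest variable spans 100."""
    widest = max(abs(beta) * (hi - lo) for beta, lo, hi in MODEL.values())
    return 100.0 / widest


def variable_points(name: str, value: float) -> float:
    """Points for one variable, measured from its axis minimum."""
    beta, lo, _ = MODEL[name]
    return beta * (value - lo) * points_per_unit()


def total_points(patient: Dict[str, float]) -> float:
    """Sum of the single scores of all variables in the model."""
    return sum(variable_points(name, patient[name]) for name in MODEL)


def predicted_risk(patient: Dict[str, float]) -> float:
    """Project the total points back to a probability (the lowest scale)."""
    baseline = INTERCEPT + sum(beta * lo for beta, lo, _ in MODEL.values())
    linear_predictor = baseline + total_points(patient) / points_per_unit()
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

Because the points are a linear rescaling of each term in the model, reading the risk from the total score gives the same probability as evaluating the logistic regression equation directly; the graphical form only removes the need for a calculator.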

Discussion
The COVID-19 pandemic has become a global public health issue, threatening millions of lives worldwide. As different clinical types of COVID-19 require different treatments and care plans, and the mortality of severely ill patients with COVID-19 is considerable [14,15], it is of vital importance to find a way to predict the risk of disease progression. Thus, in this study, we compared the clinical and radiological characteristics of severe/critical and common type COVID-19 patients, and reported a simple and easily available nomogram, based on the initial clinical and radiological characteristics, to predict the severity of COVID-19. The nomogram incorporates five variables (age, neutrophil count, creatinine, procalcitonin and number of involved lung segments) and shows high discrimination, with an AUC of 0.939.
Nguyen et al. [21] reported a nomogram that included only clinical characteristics (age, respiratory rate, overweight, temperature, CRP, troponin and lymphocyte counts); the C-statistic of their final model was 0.75. Yu et al. [22] reported a nomogram that included only age and CT features (density, mosaic perfusion sign and severity score of the lung) to predict the severity of COVID-19, and the AUC of the model was 0.929. In our study, both clinical characteristics (age, neutrophil count, creatinine and procalcitonin) and a CT feature (number of involved lung segments) were included to build the predictive model, with an AUC of 0.939. Other studies have reported models based on deep-learning algorithms that use CT images to predict the severity of COVID-19 with high accuracy [23-25]. However, these models are more difficult to build because they need complicated algorithms, lesion annotation for each object to guide learning of the algorithm or an artificial intelligence-assisted method to label information. Hence, these models are not available to every institution and clinician. In our study, only the initial clinical characteristics (age, neutrophil count, creatinine and procalcitonin) and one chest CT characteristic (number of involved lung segments), which are easily available and identified in clinical practice, were needed for the predictive model. A nomogram is a reliable tool that presents a statistical predictive model as a simple, intuitive graph quantifying the risk of a clinical event [26]. Thus, our model is more user-friendly while retaining high discrimination.
10.2217/fvl-2020-0193 Future Virol. (Epub ahead of print)
As reported by previous studies, the most common symptoms of COVID-19 were fever and cough, and the most common chest CT characteristic was GGO [4,6,27]. By univariable analysis, we found that age, co-morbidities, leukocyte count, lymphocyte count, neutrophil count, procalcitonin, aspartate transaminase, lactate dehydrogenase, albumin, creatinine, lesion size and number of involved lung segments were related to the severity of COVID-19, which has also been reported before [4,6,27]. The results showed that patients with the severe clinical type may have impaired liver and renal functions, activating the body's inflammatory response. Lymphocytopenia was a prominent feature of the severe clinical type, which may be caused by necrosis or apoptosis of lymphocytes; this has also been found in Middle East respiratory syndrome and severe acute respiratory syndrome [28,29]. Men more frequently developed the severe clinical type, but the difference was not significant [14]. Except for dyspnea, no other clinical symptom was significantly related to the severity of COVID-19, which made the discrimination more difficult. In cases that initially involved more lung segments, the virus was more widely distributed in the lungs and these patients were more likely to develop severe-type disease. The larger the lesion, the greater the extent of lung involvement, and the more likely the case was to develop into the severe/critical type.
After multivariable analysis, we found that old age, increased neutrophil count, increased creatinine and involvement of more lung segments were independent risk factors for severe/critical COVID-19. An elevated serum procalcitonin level correlates with the severity of microbial invasion and usually occurs in severe shock, systemic inflammatory response syndrome and multiple organ dysfunction syndrome [20]. Considering this, procalcitonin was added to the nomogram, because it is recommended that variables be selected on the basis of either previous research or clinical reasoning, so that the exclusion of variables because of missing data is avoided and consistent data collection is maintained [30]. Additionally, the model showed good discrimination of the severity of COVID-19, with an AUC of 0.939.
Some limitations of this study should be acknowledged. First, this is a single-center retrospective study and the study population is relatively small, which could inevitably result in some bias. In the future, multicenter studies should be conducted to verify this model. Second, because of the small sample size, it was difficult to set up a validation cohort to assess the predictive accuracy of our nomogram. However, we used bootstrap resampling for internal validation. Moreover, the calibration curve showed good agreement between the prediction of the nomogram and the observations in the primary cohort.

Conclusion
We reported a simple and easy-to-use model (nomogram) that is based on the initial clinical and chest CT characteristics to predict the severity of COVID-19, with high discrimination (AUC 0.939). This nomogram can help clinicians to easily and quickly identify, at the beginning of admission, patients who may progress to a severe/critical clinical type and to choose available treatments in a timely manner, thus helping clinicians to be better prepared for severe COVID-19. Additionally, under such a severe epidemic, early intervention may alter the outcomes for patients who may develop severe conditions.